Model-based imputation approach for data analysis in the presence of non-detects.

نویسندگان

  • K Krishnamoorthy
  • Avishek Mallick
  • Thomas Mathew
چکیده

A model-based multiple imputation approach for analyzing sample data with non-detects is proposed. The imputation approach involves randomly generating observations below the detection limit using the detected sample values and then analyzing the data using complete sample techniques, along with suitable adjustments to account for the imputation. The method is described for the normal case and is illustrated for making inferences for constructing prediction limits, tolerance limits, for setting an upper bound for an exceedance probability and for interval estimation of a log-normal mean. Two imputation approaches are investigated in the paper: one uses approximate maximum likelihood estimates (MLEs) of the parameters and a second approach uses simple ad hoc estimates that were developed for the specific purpose of imputations. The accuracy of the approaches is verified using Monte Carlo simulation. Simulation studies show that both approaches are very satisfactory for small to moderately large sample sizes, but only the MLE-based approach is satisfactory for large sample sizes. The MLE-based approach can be calibrated to perform very well for large samples. Applicability of the method to the log-normal distribution and the gamma distribution (via a cube root transformation) is outlined. Simulation studies also show that the imputation approach works well for constructing tolerance limits and prediction limits for a gamma distribution. The approach is illustrated using a few practical examples.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

چند رویکرد برخورد با مقادیر گمشده‌ متغیرهای کمی و بررسی اثر آنها بر نتایج حاصل از یک کارآزمایی‌ بالینی

Background and Objectives: A major challenge that affects the longitudinal studies is the problem of missing data. Missing in the data may result in the loss of part of the information which reduces the accuracy of the estimator and obtain the results will be biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism from a longitudinal research and to consider thi...

متن کامل

Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)

Most of the geochemical datasets include missing data with different portions and this may cause a significant problem in geostatistical modeling or multivariate analysis of the data. Therefore, it is common to impute the missing data in most of geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...

متن کامل

An Empirical Comparison of Performance of the Unified Approach to Linearization of Variance Estimation after Imputation with Some Other Methods

Imputation is one of the most common methods to reduce item non_response effects. Imputation results in a complete data set, and then it is possible to use naϊve estimators. After using most of common imputation methods, mean and total (imputation estimators) are still unbiased. However their variances (imputation variances) are underestimated by naϊve variance estimators. Sampling mechanism an...

متن کامل

Estimating Returns to Scale in the Presence of Undesirable Factors in Data Envelopment Analysis

This research identifies returns to scale (RTS) of efficient decision making units (DMUs) with desirable (good) and undesirable (bad) inputs and outputs by presenting a new DEA (data envelopment analysis) approach. In this study, we first introduce a new input-output oriented model to determine efficient DMUs in the presence of undesirable factors and then, returns to scale of these DMUs are es...

متن کامل

Influence of Pattern of Missing Data on Performance of Imputation Methods: An Example from National Data on Drug Injection in Prisons

Background Policy makers need models to be able to detect groups at high risk of HIV infection. Incomplete records and dirty data are frequently seen in national data sets. Presence of missing data challenges the practice of model development. Several studies suggested that performance of imputation methods is acceptable when missing rate is moderate. One of the issues which was of less concern...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • The Annals of occupational hygiene

دوره 53 3  شماره 

صفحات  -

تاریخ انتشار 2009